Abstract

This paper proposes an alternate method for finding several Pareto optimal
points for a general nonlinear multicriteria optimization problem, aimed
at capturing the tradeoff among the various conflicting objectives. It can
be rigorously proved that this method is completely independent of the
relative scales of the functions and is quite successful in producing an evenly
distributed set of points in the Pareto set given an evenly distributed set
of ‘weights’, a property which the popular method of linear combinations
lacks. Further, this method can be easily extended in case of more than two
objectives while retaining the computational efficiency of continuation-type
algorithms, which is an improvement over homotopy techniques for tracing
the tradeoff curve.
Computer Applications in Science and Engineering (ICASE), NASA Langley Research
Center, Hampton, VA 23681-0001.
1  Introduction


A wide variety of problems arising in design optimization of engineering
systems are essentially multicriteria in nature (see, for example,
Eschenauer, Koski and Osyczka [1] and Statnikov and Matusov [2]). For
example, a typical bridge-construction design might involve simultaneously
minimizing the total mass of the structure and maximizing its stiffness.
However, it is highly improbable that these conflicting objectives would both be
‘extremized’ by the same design, hence some tradeoff between the objective
functions is desired to ensure an efficient design. Mathematically such
a multicriteria optimization problem can be written as:


$$\min_{x \in C} F(x) = \begin{bmatrix} f_1(x) \\ \vdots \\ f_n(x) \end{bmatrix}, \quad n \ge 2, \qquad \text{(MOP)}$$


where

$$C = \{x : h(x) = 0,\ g(x) \le 0,\ a \le x \le b\},$$

$F : \mathbb{R}^N \to \mathbb{R}^n$, $h : \mathbb{R}^N \to \mathbb{R}^{ne}$ and $g : \mathbb{R}^N \to \mathbb{R}^{ni}$ are twice continuously
differentiable mappings, and $a \in (\mathbb{R} \cup \{-\infty\})^N$, $b \in (\mathbb{R} \cup \{\infty\})^N$, $N$ being
the number of variables, $n$ the number of objectives, $ne$ and $ni$ the number
of equality and inequality constraints.


Since no single $x^*$ would generally minimize every $f_i$ simultaneously, a
concept of optimality which is useful in the multiobjective framework is that
of Pareto optimality, as defined below:


Definition: A point $x^* \in C$ is said to be (globally) Pareto optimal or
a (globally) efficient point or a non-dominated or a non-inferior point for
(MOP) if and only if there is no $x \in C$ such that $F(x) \le F(x^*)$ with at least one
strict inequality (the $\le$ implies term-by-term inequality).
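
The definition can be checked mechanically on a finite sample of objective vectors. The following is a minimal sketch, not part of the method itself; the function names `dominates` and `nondominated` and the sample data are illustrative assumptions.

```python
from typing import List, Sequence


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if fa <= fb term by term, with at least one strict inequality."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def nondominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the objective vectors that no vector in `points` dominates."""
    # A vector never dominates itself (no strict inequality), so no
    # special handling of the comparison with itself is needed.
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative sample: (3, 3) is dominated by (2, 2); the rest are efficient
# within this sample.
sample = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = nondominated(sample)
```

Such a filter only establishes non-dominance relative to the sampled points, not global Pareto optimality over $C$.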


The shadow minimum, $F^*$, is defined as the vector containing the individual
global minima, $f_i^*$, of the objectives, i.e.,

$$F^* = [f_1^*, \ldots, f_n^*]^T.$$
(We assume here and henceforth the existence of a minimum for each of
our objectives.) The shadow minimum could thus be attained only in the
rare case when a single $x$ minimizes all the objective functions. However, in
practical situations, the best we can hope for is to get close to the shadow
minimum and ensure that there is an agreeable trade-off among the multiple
objectives.

Very  often  in  engineering  applications  the  desired  solution  is  a  whole 
collection  of  Pareto  optimal  points,  representative  of  the  entire  spectrum 
of  efficient  solutions.  Thus  ideally,  the  desired  solution  is  the  entire  Pareto 
optimal  set,  which  can  be  obtained  for  some  small  problems  which  allow 
themselves  to  be  treated  parametrically,  resulting  in  closed-form  expressions 
for  the  Pareto  set  (see  Lin  [3]).  More  recently,  attempts  have  been  made 
to  approximate  the  entire  curve  of  Pareto  optimal  solutions  in  bi-objective 
problems  using  techniques  which  trace  the  curve  of  parametrized  optima 
(see  Rakowska,  Haftka  and  Watson  [4],  Rao  and  Papalambros  [5],  Lundberg 
and  Poore  [6]).  The  next  best  solution,  which  is  very  acceptable  in  most 
applications,  is  a  set  of  Pareto  optimal  points  obtained  by  combining  the 
multiple  objectives  into  a  single  objective  function  and  minimizing  the  single 
objective over various values of the parameters used to combine the objectives.
For example, it is possible to generate a set of Pareto optimal points
by minimizing a convex combination of the objectives, $\alpha^T F(x)$, over $x \in C$,
where $\alpha \ge 0$ (component-wise) and $\sum_{i=1}^n \alpha_i = 1$, and performing the
minimization for different choices of $\alpha$ (see, among many others, Koski [7]). In
this  article,  we  propose  a  new  method  for  generating  Pareto  optimal  points 
which  is  at  least  as  efficient  as  these  methods  and,  unlike  the  techniques  for 
tracing  the  curve  of  Pareto  optimal  solutions,  can  be  applied  to  problems 
with  more  than  two  objectives. 


2  Preliminaries 

First  let  us  introduce  some  terminology: 

Convex Hull of Individual Minima (CHIM): Let $x_i^*$ be the respective
global minimizers of $f_i(x)$, $i = 1, \ldots, n$ over $x \in C$. Let $F_i^* = F(x_i^*)$,
$i = 1, \ldots, n$. Let $\Phi$ be the $n \times n$ matrix whose $i$th column is $F_i^* - F^*$.
Then the set of points in $\mathbb{R}^n$ that are convex combinations of $F_i^* - F^*$, i.e.,
$\{\Phi w : w \in \mathbb{R}^n,\ \sum_{i=1}^n w_i = 1,\ w_i \ge 0\}$, is referred to as the Convex Hull of
Individual Minima.
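
To make the construction concrete, the sketch below builds $\Phi$ from a table of objective values at the individual minimizers and evaluates a CHIM point $\Phi w$. The input layout `payoff[i][j]` (value of $f_j$ at $x_i^*$), the function names and the numbers are assumptions made for illustration only.

```python
from typing import List

Matrix = List[List[float]]


def build_phi(payoff: Matrix) -> Matrix:
    """Build Phi from payoff[i][j] = f_j evaluated at the minimizer x_i* of f_i.

    Returns Phi with Phi[j][i] = f_j(x_i*) - f_j*, so column i is F(x_i*) - F*.
    Assumes each x_j* really minimizes f_j, so that f_j* = payoff[j][j].
    """
    n = len(payoff)
    shadow = [payoff[j][j] for j in range(n)]  # shadow minimum F*
    return [[payoff[i][j] - shadow[j] for i in range(n)] for j in range(n)]


def chim_point(phi: Matrix, w: List[float]) -> List[float]:
    """Return Phi w for a convex weighting w (w_i >= 0, sum of w_i equal to 1)."""
    if any(wi < 0 for wi in w) or abs(sum(w) - 1.0) > 1e-12:
        raise ValueError("w must be a convex weighting")
    return [sum(row[i] * w[i] for i in range(len(w))) for row in phi]


# Illustrative data: F(x_1*) = (1, 9) and F(x_2*) = (5, 3), so F* = (1, 3).
phi = build_phi([[1.0, 9.0], [5.0, 3.0]])
midpoint = chim_point(phi, [0.5, 0.5])
```

For two objectives the resulting matrix has zeros on the diagonal, and `midpoint` is the middle of the CHIM segment in the shifted objective space.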

The set of attainable objective vectors, $\{F(x) : x \in C\}$, is denoted
by $\mathcal{F}$, so $F : C \to \mathcal{F}$, i.e., $C$ is mapped by $F$ onto $\mathcal{F}$. The space $\mathbb{R}^n$ which
contains $\mathcal{F}$ is usually referred to as the objective space. The map of $C$
under $F$ in the objective space is often called the multi-loss map^1 (bi-loss
map, if $n = 2$). We shall denote the boundary of $\mathcal{F}$ by $\partial\mathcal{F}$. The set of all
Pareto optimal points is usually denoted by $\mathcal{P}$. The complete curve/surface
of Pareto minima (continuous or not) is often referred to as the trade-off
function (see p. 9, Haimes, Hall and Freedman [8]).

CHIM+: Let CHIM$_\infty$ be the affine subspace of lowest dimension
that contains the CHIM. Then CHIM+ is defined as the smallest
simply-connected set that contains every point in the intersection of $\partial\mathcal{F}$ and CHIM$_\infty$.
More informally, consider extending (or withdrawing) the boundary of the
CHIM simplex to touch $\partial\mathcal{F}$; the ‘extension’ of CHIM thus obtained is
defined as CHIM+.

Henceforth, it shall be assumed that the objective functions have been
defined with the shadow minimum shifted to the origin, so that all the
objective functions are non-negative, i.e., $F(x)$ is redefined as:

$$F(x) \leftarrow F(x) - F^*.$$

We observe that in Fig. 1, which shows the set $\mathcal{F}$ in the objective space, the
point A is $F_1^*$, B is $F_2^*$, O is the shadow minimum (and the origin), the
broken line segment AB is the CHIM, while the ‘arc’ ACB is the set of all
Pareto minima in the objective space; alternately, the trade-off curve. In
this (and any) problem with $n = 2$ (i.e., bi-objective), CHIM = CHIM+
and the matrix $\Phi$ is anti-diagonal.


3  Central  Idea 

The pivotal idea behind our approach will be introduced by means of a simple
observation: the intersection point between the normal emanating from
any point in the CHIM and the boundary $\partial\mathcal{F}$ is probably a Pareto optimal
point; the point of intersection closest to the origin is a Pareto minimal
point (while the one furthest is a Pareto maximal point). We say ‘probably’
because this may not always be true, e.g., when the boundary is ‘folded’
(see Fig. 2). But it is true when the trade-off surface in the objective space
is convex, which happens in almost every application found in the literature
(see for example the problems in Refs. 1, 2 and 7).

^1 This terminology is widely used in game theory.

Figure 1: A typical bi-loss map

Given a convex weighting $w$, $\Phi w$ represents a point in the CHIM. Let $\hat{n}$
denote the unit normal to the CHIM simplex pointing towards the origin;
then $\Phi w + t\hat{n}$, $t \in \mathbb{R}$, represents the set of points on that normal. Then the
point of intersection between the normal and the boundary of $\mathcal{F}$ closest to
the origin is identical to the solution of the following subproblem:

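$$\begin{aligned}
\max_{x,t}\quad & t \\
\text{s.t.}\quad & \Phi w + t\hat{n} = F(x) \\
& h(x) = 0 \qquad\qquad \text{(NBI}_w\text{)} \\
& g(x) \le 0 \\
& a \le x \le b.
\end{aligned}$$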

The constraints $\Phi w + t\hat{n} = F(x)$ ensure that the point $x$ is actually mapped
by $F$ to a point on the normal, while the remaining constraints ensure
feasibility of $x$ with respect to the constraint set in the original problem (MOP).

Figure 2: NBI started at Q converges to P (locally Pareto optimal), whereas
the corresponding globally efficient point would have been P*.

The subproblem above shall be referred to as the NBI subproblem, often
written as NBI$_w$ (since $w$ is the characterizing parameter of the subproblem),
and solutions of these subproblems will be referred to as NBI points.
The idea is to solve NBI$_w$ for various $w$ and find several points on the
boundary of $\mathcal{F}$, effectively constructing a pointwise approximation to the
part of the boundary containing the Pareto minimal set.
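
As a small worked illustration, consider a toy unconstrained problem chosen here for convenience (it does not appear in the references): one variable and two quadratic objectives. Under these assumptions the NBI subproblem can be solved in closed form, because the two equality constraints already determine $(x, t)$.

```latex
\begin{align*}
& f_1(x) = x^2, \qquad f_2(x) = (x-1)^2, \qquad x \in \mathbb{R}, \\
& x_1^* = 0, \quad x_2^* = 1, \quad F^* = (0, 0)^T, \quad
  \Phi = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
  \hat{n} = -\tfrac{1}{\sqrt{2}}\,(1, 1)^T, \\
& \Phi w + t\hat{n} = F(x)
  \iff
  w_2 - \tfrac{t}{\sqrt{2}} = x^2, \quad
  w_1 - \tfrac{t}{\sqrt{2}} = (x-1)^2 . \\
& \text{Subtracting the two equations: } w_2 - w_1 = 2x - 1,
  \text{ and with } w_1 + w_2 = 1: \\
& x = w_2, \qquad t = \sqrt{2}\,(w_2 - w_2^2) = \sqrt{2}\, w_1 w_2, \qquad
  F(x) = (w_2^2,\ w_1^2)^T .
\end{align*}
```

In this example, sweeping $w_2$ over $[0, 1]$ traces exactly the Pareto set $x \in [0, 1]$, and the two vertices $w = e_1$ and $w = e_2$ reproduce the individual minima with $t = 0$.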

As indicated earlier, not all NBI points are Pareto optimal points. For
biobjective problems, for every Pareto optimal point there exists a
corresponding NBI subproblem of which it is the solution. The same is true for
$n \ge 3$, with one difference: the components of the weight $w$ for NBI$_w$ may
not add up to 1. As a simple example, suppose $\mathcal{F}$ is a sphere in $\mathbb{R}^3$ touching
the coordinate axes, for simplicity. Then the CHIM simplex is the triangle
formed by joining the three points where the sphere touches the axes. Quite
clearly, CHIM $\ne$ CHIM+ and there exist points in CHIM+ $\setminus$ CHIM
underneath which there are Pareto optimal points on the sphere. However,
since these points are not in CHIM, they do not satisfy $\sum_i w_i = 1$. Thus,
by solving NBI$_w$ for $\sum_{i=1}^n w_i = 1$, a portion of the Pareto set might be
overlooked for problems with $n > 2$. However, these overlooked points are likely
to be ‘extremal’ Pareto points which are not interesting from the tradeoff
standpoint, which is our primary goal.
3.1  Some details

3.1.1  Structure of $\Phi$

The $i$th column of $\Phi$ is described by

$$\Phi(:, i) = F(x_i^*) - F^*.$$

Since $f_i(x_i^*) = f_i^*$, clearly,

$$\Phi(i, i) = 0.$$

Furthermore, if $x_i^*$ is the global minimizer of $f_i(x)$, then

$$\Phi(i, j) \ge 0, \quad j \ne i.$$

Thus, a negative element in position $(j, k)$ of $\Phi$ signifies that $x_j^*$ is not the
global minimizer of $f_j(x)$, and $f_j(x_k^*) < f_j(x_j^*)$, i.e., $x_k^*$ improves on the
current local minimum of $f_j(x)$. This very fortunate occurrence can help
refine the local minimum of an objective by a simple examination of $\Phi$.

Even a zero element of $\Phi$ in an off-diagonal position, say, $(j, k)$, would
signify that $x_k^*$ is a minimizer of both $f_j(x)$ and $f_k(x)$, which could make $x_k^*$
or its nearby points very desirable choices.
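
The examination of $\Phi$ described above is easy to automate. The sketch below assumes $\Phi$ is stored row by row with `phi[j][k]` equal to $f_j(x_k^*) - f_j^*$ (0-based indices); the function name `inspect_phi` and the sample matrix are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Positions = List[Tuple[int, int]]


def inspect_phi(phi: Matrix, tol: float = 0.0) -> Tuple[Positions, Positions]:
    """Return (negative, zero) off-diagonal positions (j, k) of Phi, 0-based.

    With phi[j][k] = f_j(x_k*) - f_j*, a negative entry means x_k* attains a
    lower value of f_j than the minimum currently recorded for f_j; a zero
    entry means x_k* also minimizes f_j.
    """
    negative: Positions = []
    zero: Positions = []
    for j, row in enumerate(phi):
        for k, value in enumerate(row):
            if j == k:
                continue  # diagonal entries are zero by construction
            if value < -tol:
                negative.append((j, k))
            elif abs(value) <= tol:
                zero.append((j, k))
    return negative, zero


# Illustrative matrix: entry (0, 1) is negative, entry (1, 2) is zero.
phi = [[0.0, -0.5, 2.0],
       [1.0, 0.0, 0.0],
       [3.0, 4.0, 0.0]]
negative, zero = inspect_phi(phi)
```

A nonempty `negative` list suggests re-running the individual minimization of the flagged objective from the flagged point before building the CHIM.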

3.1.2  Quasi-normal instead of normal direction

The idea of a family of normals intersecting the boundary is valid even
if we do not have the exact normal direction to the CHIM simplex, but
some quasi-normal direction $\hat{n}$ which points towards the origin. ‘Shooting’
a family of quasi-normal rays towards the boundary also gets us our desired
boundary points. In practice we choose our quasi-normal direction to be an
equally-weighted linear combination of the columns of $\Phi$, multiplied by $-1$
to ensure that it points towards the origin. Explicitly,

$$\hat{n} = -\Phi e,$$

where $e$ is the column vector of all ones.
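
In code, the quasi-normal is simply the negated vector of row sums of $\Phi$. The short sketch below also evaluates a point $\Phi w + t\hat{n}$ on the corresponding ray; the helper names and the sample matrix are illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]


def quasi_normal(phi: Matrix) -> List[float]:
    """Return n_hat = -Phi e, i.e. minus the row sums of Phi."""
    return [-sum(row) for row in phi]


def point_on_ray(phi: Matrix, w: List[float], t: float) -> List[float]:
    """Return Phi w + t * n_hat for the quasi-normal n_hat = -Phi e."""
    n_hat = quasi_normal(phi)
    phi_w = [sum(row[i] * w[i] for i in range(len(w))) for row in phi]
    return [pw + t * nj for pw, nj in zip(phi_w, n_hat)]


phi = [[0.0, 4.0], [6.0, 0.0]]                # illustrative anti-diagonal Phi
n_hat = quasi_normal(phi)
shifted = point_on_ray(phi, [0.5, 0.5], 0.25)
```

Because the entries of $\Phi$ are nonnegative when the individual minima are global, every component of `n_hat` is nonpositive, so increasing `t` never increases any objective value along the ray.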

The quasi-normal direction defined as above has the property that the
NBI point found for a certain $w$ is completely independent of the scales of
the objective functions. In other words, if NBI$_w$ is re-solved with the
objective functions rescaled by arbitrary positive factors, the NBI point found
remains unchanged. This fact will be proved later.

Given that $\Phi$ has nonnegative components as discussed in the previous
subsection, it is clear that all components of $\Phi e$ are nonnegative.

Even though a quasi-normal direction will be used in our computations,
we prefer to retain the name ‘NBI’, rather than change it to something like
‘QNBI’. The authors hope that this misnomer will not be judged too
harshly.


3.1.3  Further  insight:  NBI  and  goal  programming 

Since $t$ is being maximized in the NBI subproblem and $\Phi w + t\hat{n} = F(x)$,
$x \in C$, this maximization subproblem attempts to find a feasible point $x$ as
far from a ‘target’ point $\Phi w$ as possible, with $\hat{n} \le 0$ (componentwise)
guaranteeing nonincrease in the components of $F(x)$ relative to the components
of $\Phi w$.

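This is similar to goal programming. If we take the Pareto set to be
convex in the objective space, ‘equality goal programming’^2 can be thought
of as NBI where the direction $\hat{n}$ is, up to sign, one of the canonical basis
vectors $e_i$ (i.e., with 1 in the $i$th position and 0 in the rest). To be precise,
the subproblem NBI$_w$ with $\hat{n} = -e_i$ (the sign makes $\hat{n}$ point towards the
origin, consistent with the maximization of $t$) has the same solution as the
following goal programming problem:

$$\begin{aligned}
\min_{x}\quad & f_i(x) \\
\text{s.t.}\quad & f_j(x) = (\Phi w)_{(j)}, \quad j = 1, \ldots, n,\ j \ne i \\
& x \in C,
\end{aligned}$$

where $(\Phi w)_{(j)}$ denotes the $j$th component of the vector $\Phi w$.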

Though posing the goals as equalities is untraditional, a subproblem of
this kind for obtaining a Pareto optimal point is discussed in Lin [3]
and [9].

^2 Referring to goal programming where the goal constraints are equalities instead of
inequalities.


3.1.4  Efficiently  solving  the  subproblems 

The following simple observation plays a key role in lowering the
computational expense involved in solving the NBI subproblems:

Consider weight vectors $w$ and $\bar{w}$ such that $\bar{w}$ is ‘close to’ $w$, i.e., $\|w - \bar{w}\|$
is ‘small’ in some norm. Then, it is reasonable to expect that the solution
$(x^*, t^*)$ of NBI$_w$ and the solution $(\bar{x}^*, \bar{t}^*)$ of NBI$_{\bar{w}}$ are ‘close to each
other’. Assume that we have solved NBI$_w$ first and already have the point
$(x^*, t^*)$. Then with $(x^*, t^*)$ as the starting point for solving NBI$_{\bar{w}}$, the NBI
subproblem solver can be expected to converge in a few iterations at a fast
local convergence rate^3. It is this aspect of our algorithm that gives it the
flavor of a continuation-type method.

Since we already have the individual minima of the functions, i.e., the
vertices of the CHIM simplex, we start at one of the individual minimizers
$x_i^*$ and solve a ‘nearby subproblem’, and then a subproblem close to the
one just solved, and so on.

Let us illustrate the above strategy for a biobjective problem. The
weights $w$ for only two objectives can be expressed as $[\beta, 1 - \beta]$, $\beta \in [0, 1]$.
We can take $\beta$ to assume the values

$$[0, \delta, 2\delta, \ldots, k\delta],$$

where $\delta < 1$ is the (uniform) spacing between two consecutive $w_1$ values
and $k = I[1/\delta]$, i.e., the greatest integer $\le 1/\delta$. Then the set of ‘uniformly
distributed’ weights is given by $[\beta, 1 - \beta]$, where $\beta$ ranges over the values
above.

Now, assuming $\delta \ll 1$ (say $\delta = 0.05$), the minimizer of $f_2(x)$, i.e., $x_2^*$, is
expected to be a small perturbation of the solution to the NBI subproblem
with $w = [\delta, 1 - \delta]$. Thus the NBI subproblem with this $w$ is solved starting
from $x_2^*$ and its solution is used as the starting point for solving the NBI
subproblem with $w = [2\delta, 1 - 2\delta]$, and so on, until the last weight is reached.
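
The warm-started sweep over $\beta$ can be sketched as follows. This is a toy illustration under stated assumptions, not the solver used in practice: the objectives $f_1 = x^2$, $f_2 = (x-1)^2$ are chosen here for convenience, the quasi-normal $\hat{n} = -\Phi e = (-1, -1)^T$ is used, and because there is a single variable the two equality constraints determine $(x, t)$, so a plain Newton iteration on the constraints stands in for a general NLP solver.

```python
def solve_nbi_toy(beta, start, tol=1e-12, max_iter=50):
    """Solve Phi w + t n_hat = F(x) for (x, t), w = [beta, 1 - beta].

    Toy problem (an assumption of this sketch): f1 = x**2, f2 = (x - 1)**2,
    so Phi = [[0, 1], [1, 0]] and n_hat = -Phi e = (-1, -1).
    Returns (x, t, number_of_newton_steps).
    """
    w1, w2 = beta, 1.0 - beta
    x, t = start
    for iteration in range(max_iter + 1):
        r1 = x * x - (w2 - t)            # f1(x) - (Phi w + t n_hat)_1
        r2 = (x - 1.0) ** 2 - (w1 - t)   # f2(x) - (Phi w + t n_hat)_2
        if max(abs(r1), abs(r2)) < tol:
            return x, t, iteration
        # Jacobian of (r1, r2) w.r.t. (x, t) is [[2x, 1], [2(x - 1), 1]].
        a, c = 2.0 * x, 2.0 * (x - 1.0)
        det = a - c                      # equals 2 for this toy problem
        dx = (r2 - r1) / det
        dt = (c * r1 - a * r2) / det
        x, t = x + dx, t + dt
    raise RuntimeError("Newton iteration did not converge")


steps = 20                      # delta = 1 / steps = 0.05
delta = 1.0 / steps
point = (1.0, 0.0)              # (x_2*, t = 0): the known solution for w = e_2
nbi_points = []
for k in range(1, steps + 1):
    beta = k * delta
    x, t, newton_steps = solve_nbi_toy(beta, point)
    nbi_points.append((beta, x, t, newton_steps))
    point = (x, t)              # warm start for the next, nearby subproblem
```

For this particular toy problem the closed-form solution is $x = 1 - \beta$, $t = \beta(1 - \beta)$, and Newton's method converges in a couple of steps from any start, so the benefit of warm starting is not visible here; the point of the sketch is only the ordering of the subproblems.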

Of course, ‘ordering the subproblems’ may not be so obvious for problems
with more than two objective functions, but can still be achieved, as
described in the next section.

^3 Q-quadratic if exact second derivatives are used, superlinear if a secant approximation
like BFGS is used.
4  Generating w and ordering the subproblems for
more than two objectives

In this section, we shall describe a (data) structure which simultaneously
enables the generation of weights $w$ and ordering the subproblems in a manner
amenable not only to efficient solution but also to parallelization.


4.1  Generating  w 

Let us assume that for an $n$-objective problem, $\delta_j > 0$ is the uniform spacing
between two consecutive $w_j$ values (i.e., the ‘stepsize’ on the $j$th component
of $w$) for $j = 1, \ldots, n - 1$. For simplicity, let us also assume that $1/\delta_1$ is an
integer.


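The possible values that can be assumed by $w_1$ are

$$[0, \delta_1, 2\delta_1, \ldots, 1].$$

Define $m_1 = w_1/\delta_1$. Then the possible values of $w_2$ corresponding to $w_1 = m_1\delta_1$
(all the $w_i$'s must add up to 1) are

$$[0, \delta_2, 2\delta_2, \ldots, k_2\delta_2],$$

where $I[\cdot]$ denotes the greatest integer not exceeding its argument and

$$k_2 = I\left[\frac{1 - w_1}{\delta_2}\right] = I\left[\frac{1 - m_1\delta_1}{\delta_2}\right].$$

Now define $m_2 = w_2/\delta_2$. Then the possible values of $w_3$ corresponding to
$w_1 = m_1\delta_1$ and $w_2 = m_2\delta_2$ are

$$[0, \delta_3, 2\delta_3, \ldots, k_3\delta_3],$$

where

$$k_3 = I\left[\frac{1 - m_1\delta_1 - m_2\delta_2}{\delta_3}\right].$$

Thus, corresponding to $w_i = m_i\delta_i$, $i = 1, \ldots, j - 1$, the possible values
of $w_j$ for $j = 2, \ldots, n - 1$ are

$$[0, \delta_j, 2\delta_j, \ldots, k_j\delta_j],$$

where

$$k_j = I\left[\frac{1 - \sum_{i=1}^{j-1} m_i\delta_i}{\delta_j}\right].$$

Finally the last component of $w$ is defined as

$$w_n = 1 - \sum_{i=1}^{n-1} w_i.$$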

Clearly, the entire data structure above can be thought of as a tree where
the number of children varies with the node and generation. However, a tree
structure is unnecessary for implementation; all that requires storage
are the numbers $\delta_j$. The tree is nonetheless useful as a conceptual aid.

Of the subproblems generated by the weights in the above tree, $n$ (with
$w = e_i$) are already solved while finding $F^*$. Also note that since $\delta_i/\delta_j$ is not
necessarily an integer $\forall\, i < j$, the spacings between ‘the last two’ values of
$w_n$ may not be uniform.

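Special case: Equal stepsizes on all $w_i$

Let $\delta_i = \delta$, $i = 1, \ldots, n - 1$.

Also assume that $1/\delta = p$ is an integer.

As before, the possible values of $w_1$ are

$$[0, \delta, 2\delta, \ldots, 1].$$

Then the possible values of $w_j$ corresponding to $w_i = m_i\delta$, $i = 1, \ldots, j - 1$,
for $j = 2, \ldots, n - 1$ are

$$\left[0, \delta, 2\delta, \ldots, \left(p - \sum_{i=1}^{j-1} m_i\right)\delta\right].$$

As before, $w_n = 1 - \sum_{i=1}^{n-1} w_i$, and here the values are uniformly spaced.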
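
The weight-generation scheme of this subsection can be sketched as a short recursive generator. The version below is an illustration only: the function name `generate_weights` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here so that the greatest-integer operation $I[\cdot]$ is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import floor
from typing import Iterator, List, Sequence, Tuple


def generate_weights(deltas: Sequence[Fraction]) -> Iterator[Tuple[Fraction, ...]]:
    """Yield weight vectors w of length n = len(deltas) + 1.

    deltas[j] is the stepsize on component j (0-based) of the first n - 1
    components; the last component is 1 minus the sum of the others.
    Weights are produced in depth-first (root-to-leaf) order of the tree.
    """
    def recurse(j: int, prefix: List[Fraction], remaining: Fraction):
        if j == len(deltas):
            yield tuple(prefix) + (remaining,)   # w_n = 1 - sum of the rest
            return
        k_j = floor(remaining / deltas[j])       # k_j = I[remaining / delta_j]
        for m in range(k_j + 1):
            w_j = m * deltas[j]
            yield from recurse(j + 1, prefix + [w_j], remaining - w_j)

    yield from recurse(0, [], Fraction(1))


# Equal stepsizes delta = 1/4 for a three-objective problem (p = 4).
weights = list(generate_weights([Fraction(1, 4)] * 2))
```

With equal stepsizes $\delta = 1/p$ the generator visits every point of the uniform grid on the simplex; the first weight produced is $e_n$, matching the ‘leftmost’ leaf of the tree.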


4.2  Ordering  the  subproblems 

Each path from the root of the tree (the topmost node) to a leaf (a member
of the bottommost generation) represents a unique weight $w$. It should also
be observed that the $w$ vectors are already ordered on the basis of ‘nearness’
as one traverses the tree breadthwise. Thus a strategy for picking the order
of the subproblems could be to start with the leftmost one (which has $w = e_n$
and is already solved) and solve the next one in the $w_{n-1}$ generation (which
is $w_{n-1} = \delta_{n-1}$, $w_n = 1 - \delta_{n-1}$), then the next one in the $w_{n-1}$ generation
($w_{n-1} = 2\delta_{n-1}$, $w_n = 1 - 2\delta_{n-1}$), and so on until all the subproblems for
$w_i = 0$, $i = 1, \ldots, n - 2$ have been solved. Then we move to the next node
in the $w_{n-2}$ generation (i.e., with $w_i = 0$, $i = 1, \ldots, n - 3$, $w_{n-2} = \delta_{n-2}$)
and visit all the children of this node, with the starting points of the NBI
subproblems chosen as the corresponding NBI subproblem solutions at the
previous node.

This is where the scope for parallelization comes in. The solution of
the first subproblem at the second node in the $w_{n-2}$ generation does not have
to wait until all the subproblems in the first node are solved. The first
subproblem in the second node of the $w_{n-2}$ generation with $w_{n-2} = \delta_{n-2}$,
$w_{n-1} = \delta_{n-1}$, $w_n = 1 - \delta_{n-2} - \delta_{n-1}$ could be solved immediately after
solving the first subproblem in the first node with $w_{n-2} = 0$, $w_{n-1} = \delta_{n-1}$,
$w_n = 1 - \delta_{n-1}$. Thus the first subproblem in the second node can be solved
in parallel with the second subproblem in the first node, ..., and the $k$th
subproblem in the second node can be solved in parallel with the $(k + 1)$th
subproblem of the first node. Further, the $k$th subproblem in the third node
can be solved in parallel with the $(k + 1)$th subproblem of the second node,
with the solution of the $k$th subproblem of the second node as the starting
point, and so on. This entire process of efficient parallelization is one of the
topics of our future research.
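
One possible reading of this staggered scheme is a ‘wavefront’ schedule in which the $k$th subproblem of node $a$ (nodes counted from 0, subproblems from 1) is assigned to stage $a + k$. The sketch below merely groups subproblem indices by stage under that assumption; the function name and the node sizes are hypothetical, and no claim is made that this is the parallelization the authors eventually adopted.

```python
from typing import Dict, List, Tuple


def wavefront_stages(children_per_node: List[int]) -> Dict[int, List[Tuple[int, int]]]:
    """Group subproblems (node, k) into stages that could run concurrently.

    children_per_node[a] is the number of subproblems under node a of the
    w_{n-2} generation (a = 0 is the first node); k counts from 1 within a
    node. Subproblem (a, k) is warm-started from (a - 1, k) for a > 0 and
    from (0, k - 1) within the first node, so it is placed in stage a + k,
    one stage after the subproblem it starts from.
    """
    stages: Dict[int, List[Tuple[int, int]]] = {}
    for node, count in enumerate(children_per_node):
        for k in range(1, count + 1):
            stages.setdefault(node + k, []).append((node, k))
    return stages


# Three nodes with 3, 2 and 1 subproblems respectively.
stages = wavefront_stages([3, 2, 1])
```

In this example the second stage holds the second subproblem of the first node together with the first subproblem of the second node, which is the pairing described in the text.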


5  Relationship between the NBI subproblem and
minimizing a linear combination of the objectives

In this section we illustrate how the NBI subproblem is related to the popular
method of minimizing a convex combination of the objectives. For ease of
notation, we shall assume that the problem only has equality constraints,
which can be assumed without loss of generality^4. Let $\alpha \in (\mathbb{R}_+ \cup \{0\})^n$,
$\sum_{i=1}^n \alpha_i = 1$, denote a nonnegative, convex weighting of the objectives. The
weighted linear combination problem for obtaining a Pareto optimal point
is then written as

$$\min_{x}\ \alpha^T F(x) \quad \text{s.t.} \quad h(x) = 0. \qquad (1)$$

The solution of a problem like the above will often be referred to as an LC point,
and the problem denoted by LC$_\alpha$. The ‘first part’ of the KKT conditions
for optimality^5 of $(x^*, \lambda^*)$ for problem (1) states that the gradient of the
Lagrangian with respect to $x$ should vanish at $(x^*, \lambda^*)$, i.e.,

$$\nabla_x F(x^*)\alpha + \nabla_x h(x^*)\lambda^* = 0. \qquad (2)$$

^4 $h(x)$ can be thought of as the equality constraints augmented by the active set of
inequality constraints and bounds.

Similarly, if $w$ denotes the vector of weights in NBI$_w$ (which has a very
different meaning from the weights $\alpha_i$ in the linear combinations subproblem),
the NBI subproblem can be written as

$$\begin{aligned}
\min_{x,t}\quad & -t \\
\text{s.t.}\quad & F(x) - \Phi w - t\hat{n} = 0 \\
& h(x) = 0.
\end{aligned}$$

Then the first part of the KKT conditions states that the gradient of the
Lagrangian with respect to $(x, t)$ should vanish at $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$, i.e.,

$$\begin{aligned}
\nabla_x F(x^*)\lambda^{(1)*} + \nabla_x h(x^*)\lambda^{(2)*} &= 0 \qquad (3) \\
-1 + \hat{n}^T\lambda^{(1)*} &= 0,
\end{aligned}$$

where $\lambda^{(1)*} \in \mathbb{R}^n$ represents the vector of multipliers corresponding to the
constraints $\Phi w + t\hat{n} - F(x) = 0$, and $\lambda^{(2)*} \in \mathbb{R}^{ne}$ denotes the multipliers of
the equality constraints $h(x) = 0$.


Claim:

Suppose $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is the solution of NBI$_w$. Now define the
components of the vector $\alpha$ as

$$\alpha_i = \frac{\lambda_i^{(1)*}}{\sum_{j=1}^n \lambda_j^{(1)*}}.$$

Then, problem (1) with the above convex weighting vector $\alpha$ has the solution

$$\left[x^*,\ \lambda^* = \frac{1}{\sum_{j=1}^n \lambda_j^{(1)*}}\,\lambda^{(2)*}\right].$$

Proof:

Dividing both sides of (3) by the scalar $\sum_{j=1}^n \lambda_j^{(1)*}$ and observing that $h(x^*) = 0$,
the equivalence between (2) and (3) becomes obvious.

However, quite clearly, if for some $i$, the sign of $\lambda_i^{(1)*}$ is opposite to that
of $\sum_j \lambda_j^{(1)*}$, then the vector $\alpha$ has a negative component and does not
qualify as a weight for problem (1). In such a case, either the Pareto optimality
of the NBI point $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is questionable, or the Pareto point lies
in a nonconvex part of the Pareto set^6.

Also observe the tacit assumption that $\sum_i \lambda_i^{(1)*} \ne 0$.

^5 Karush-Kuhn-Tucker conditions, or alternately the first order necessary conditions for
optimality.
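
The claim translates into a one-line normalization of the NBI multipliers, together with the two checks mentioned above (nonzero sum, no component of opposite sign). The following sketch assumes the multipliers $\lambda^{(1)*}$ are available as a plain list from whatever solver was used; the function name is hypothetical.

```python
from typing import List


def lc_weights_from_nbi(lam1: List[float]) -> List[float]:
    """Return alpha_i = lam1_i / sum_j lam1_j for NBI multipliers lam1.

    Raises ValueError if the multipliers sum to zero (the tacit assumption
    fails) or if some alpha_i is negative, in which case alpha does not
    qualify as a convex weighting for the linear combination problem.
    """
    total = sum(lam1)
    if total == 0:
        raise ValueError("multipliers sum to zero; alpha is undefined")
    alpha = [value / total for value in lam1]
    if any(a < 0 for a in alpha):
        raise ValueError("a multiplier has sign opposite to the sum")
    return alpha


alpha = lc_weights_from_nbi([-1.0, -3.0])   # illustrative multipliers
```

With $\hat{n}$ having nonpositive components, the condition $\hat{n}^T\lambda^{(1)*} = 1$ typically yields negative multipliers, as in the illustrative call; the normalized weights are nonetheless nonnegative and sum to one.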


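Just as the analysis above suggests a method for obtaining $\alpha$ for problem
LC$_\alpha$ given the corresponding solution of NBI$_w$, one can also obtain the
NBI point corresponding to a given solution of problem LC$_\alpha$ with very little
effort.

Suppose $(x^*, \lambda^*)$ solves problem LC$_\alpha$. Let $(\bar{w}, t^*)$ be the solution of the
$(n + 1) \times (n + 1)$ linear system

$$\begin{aligned}
\Phi\bar{w} + t\hat{n} &= F(x^*) \\
\sum_{i=1}^n \bar{w}_i &= 1.
\end{aligned}$$

Then $(x^*, \lambda^*)$ corresponds to the solution of NBI$_w$ with $w = \bar{w}$, i.e., the
solution of NBI$_{\bar{w}}$ is

$$\left(x^*,\ t^*,\ \lambda^{(1)*} = \frac{\alpha}{\alpha^T\hat{n}},\ \lambda^{(2)*} = \frac{\lambda^*}{\alpha^T\hat{n}}\right).$$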

Proof:

Dividing (2) on both sides by $\alpha^T\hat{n}$ (assumed nonzero^7) and observing
that $\lambda^{(1)*}$ defined above satisfies $\hat{n}^T\lambda^{(1)*} = 1$, it can be seen that the first part
of the KKT conditions for NBI$_{\bar{w}}$ holds. Further observing that $h(x^*) = 0$
and $\Phi\bar{w} + t^*\hat{n} = F(x^*)$, the required equivalence between LC$_\alpha$ and NBI$_{\bar{w}}$
follows.

^6 Pareto points in nonconvex parts of the Pareto set cannot be obtained by minimizing
a linear combination of the objectives, a proof of which will appear in a future article.
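
The conversion from an LC point to the corresponding NBI data amounts to one small linear solve. The sketch below is illustrative: it uses a plain Gaussian elimination so that it is self-contained, the function names are hypothetical, and the test data come from a toy problem assumed here ($f_1 = x^2$, $f_2 = (x-1)^2$, no constraints, $\hat{n} = -\Phi e$), for which $\alpha = (0.7, 0.3)$ gives the LC point $x^* = 0.3$.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def solve_linear(a: Matrix, b: Sequence[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def nbi_from_lc(phi: Matrix, n_hat: Sequence[float], f_at_x: Sequence[float],
                alpha: Sequence[float], lam: Sequence[float]
                ) -> Tuple[List[float], float, List[float], List[float]]:
    """Return (w_bar, t, lam1, lam2) for the NBI subproblem matching an LC point.

    Solves Phi w + t n_hat = F(x*), sum(w) = 1 for the n + 1 unknowns (w, t),
    then scales the LC data by 1 / (alpha^T n_hat) to get the NBI multipliers.
    """
    n = len(phi)
    a = [list(phi[j]) + [n_hat[j]] for j in range(n)] + [[1.0] * n + [0.0]]
    b = list(f_at_x) + [1.0]
    sol = solve_linear(a, b)
    w_bar, t = sol[:n], sol[n]
    scale = sum(ai * ni for ai, ni in zip(alpha, n_hat))   # alpha^T n_hat
    lam1 = [ai / scale for ai in alpha]
    lam2 = [li / scale for li in lam]
    return w_bar, t, lam1, lam2


# Toy data: x* = 0.3, F(x*) = (0.09, 0.49), no equality constraints.
w_bar, t_star, lam1, lam2 = nbi_from_lc(
    [[0.0, 1.0], [1.0, 0.0]], [-1.0, -1.0], [0.09, 0.49], [0.7, 0.3], [])
```

For this toy case the recovered NBI weight coincides with $\alpha$, which is a property of the symmetric example rather than a general fact.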


6  Proof of independence with respect to function
scales using the quasi-normal

In this section we shall prove that the NBI point found using the
quasi-normal $\hat{n}$ and a particular $w$ is independent of how the individual functions
are scaled.

Let the objective functions be scaled by positive scalars $s_i$ as

$$f_i(x) \to s_i f_i(x), \quad i = 1, \ldots, n.$$

In other words, if $s$ is the vector with components $s_i$ and $S = \mathrm{diag}(s)$, then

$$F(x) \to S F(x).$$

Consequently

$$\nabla_x F(x) \to \nabla_x F(x) S.$$

The quasi-normal direction $\hat{n} = -\Phi e$ after scaling becomes $-S\Phi e$.
Claim:

If $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled NBI$_w$ (i.e., with $S = I_n$), then
$(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves^8 NBI$_w$ with the functions scaled as above.

Proof: Since $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled NBI$_w$ (still with
only equality constraints as in the previous section),

$$\begin{aligned}
\nabla_x F(x^*)\lambda^{(1)*} + \nabla_x h(x^*)\lambda^{(2)*} &= 0 \\
\hat{n}^T\lambda^{(1)*} &= 1 \\
\Phi w + t^*\hat{n} &= F(x^*) \\
h(x^*) &= 0.
\end{aligned}$$

^7 Since $\alpha$ has nonnegative components (not all zero) and $\hat{n}$ has negative components,
the assumption holds.

^8 Here ‘solves’ means ‘finds a stationary point of the nonlinear programming problem’.
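The first equation can be rewritten to state that the following holds:

$$(\nabla_x F(x^*) S)(S^{-1}\lambda^{(1)*}) + \nabla_x h(x^*)\lambda^{(2)*} = 0. \qquad (4)$$

The second equation, with $\hat{n} = -\Phi e$, implies

$$-e^T\Phi^T\lambda^{(1)*} = 1 \implies -e^T\Phi^T S\,(S^{-1}\lambda^{(1)*}) = 1.$$

Since $S = S^T$, the above is the same as

$$-(e^T (S\Phi)^T)(S^{-1}\lambda^{(1)*}) = 1. \qquad (5)$$

The third equation can be rewritten as

$$\Phi w - t^*\Phi e = F(x^*) \implies S\Phi w - t^* S\Phi e = S F(x^*). \qquad (6)$$

Clearly, equations (4), (5) and (6) imply that $(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves
NBI$_w$ with the functions scaled by $S$.

(QED)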

The above result does not depend on $e$ being the vector of all ones and
consequently holds if $\hat{n}$ is scaled by a factor, say, a normalization constant.

The above result suggests that no matter how disparately the different
functions might be scaled, NBI with the quasi-normal finds a set of points
as if the functions were all scaled to the same order of magnitude.
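
The scale independence can be checked numerically on a small example. The sketch below is an illustration under assumptions made here: it reuses the toy problem $f_1 = x^2$, $f_2 = (x-1)^2$, whose NBI point for $w = (0.25, 0.75)$ with $\hat{n} = -\Phi e$ is $x^* = 0.75$, $t^* = 0.1875$, and verifies that the same $(w, t^*)$ satisfies the NBI constraint after the objectives are rescaled by widely different factors.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def nbi_residual(phi: Matrix, w: Sequence[float], t: float,
                 f_at_x: Sequence[float]) -> List[float]:
    """Residual of Phi w + t n_hat - F(x), with the quasi-normal n_hat = -Phi e."""
    n = len(phi)
    residual = []
    for j in range(n):
        phi_w_j = sum(phi[j][i] * w[i] for i in range(n))
        n_hat_j = -sum(phi[j])
        residual.append(phi_w_j + t * n_hat_j - f_at_x[j])
    return residual


w = [0.25, 0.75]
x_star, t_star = 0.75, 0.1875
phi = [[0.0, 1.0], [1.0, 0.0]]
f_at_x = [x_star ** 2, (x_star - 1.0) ** 2]

scales = [1000.0, 0.001]                         # S = diag(scales)
phi_scaled = [[s * v for v in row] for s, row in zip(scales, phi)]
f_scaled = [s * f for s, f in zip(scales, f_at_x)]

unscaled_residual = nbi_residual(phi, w, t_star, f_at_x)
scaled_residual = nbi_residual(phi_scaled, w, t_star, f_scaled)
```

Both residuals vanish (up to rounding), so the pair $(x^*, t^*)$ found for the unscaled problem remains feasible for the rescaled subproblem with the same $w$, in line with equation (6).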

7  Advantages  of  using  NBI 

•  Finds a uniform spread of Pareto points: Consider any method
which parametrically combines all the objective functions into a single
objective and finds efficient points by minimizing the single objective
for various values of the parameters. Then, in general, the mapping
from the set of parameters to the set of Pareto optimal points is not
one-to-one. Thus it might so happen that minimizations over several
different parameters produce the very same point each time, resulting
in fruitless computational expense; this is never the case with NBI.
Moreover, in the absence of convexity, “Pareto-optimal solutions
obtained by this method are often found to be so few, or the corresponding
indexes so extreme, that there seems to be no middle ‘ground’ for
any compromise, although such ‘ground’ may actually exist” - Lin [9].
For examples, refer to Lin [9], Katopis and Lin [10], Lin [11].

The interrelationship between the linear combinations subproblem and
the NBI subproblem provides more insight into why the linear
combinations technique fails to give a uniformly distributed set of Pareto
optima. By fixing the weights $\alpha$ in subproblem LC, we are in effect
fixing the multipliers of the corresponding NBI subproblem, thus partly
restricting the solution of the resultant subproblem. Even if the Pareto
optima are uniformly distributed in the Pareto set, there is no reason
why the corresponding multipliers have to be uniformly distributed.
However, the weights in the linear combinations approach are often
very desirable because they give an idea of the relative importance of
the objectives. Thus obtaining the NBI points, which are uniformly
distributed, and then finding the corresponding weights $\alpha$ for the NBI
points can be very useful.

•  Advantages  over  homotopy  techniques:  NBI  improves  over  ho- 
motopy/continuation  techniques  for  tracing  the  curve  of  Pareto  opti¬ 
mal  solutions,  like  the  one  discussed  in  Rakowska,  Haftka  &  Watson 
[4],  in  the  following  respects: 

—  It  is  applicable  for  more  than  tioo  objectives  For  a  multiobjective 
problem  with  more  than  two  objectives  the  homotopy  parameter 
is  not  a  scalar  and  the  associated  differential  equations  turn  out 
to  be  a  system  of  nonlinear  partial  differential  equations  with  not 
readily  available  boundary  conditions,  rather  than  an  ordinary 
initial  value  problem,  as  in  the  case  of  two  objectives.  Thus 
extending  homotopy  techniques  to  handle  n  >  2  is  very  difficult. 
On  the  other  hand,  NBI  can  be  extended  to  handle  more  than 
two  objectives  quite  easily. 

-  It does not require an exact Hessian. Even for a biobjective problem,
solving the homotopy boundary value problem requires exact second
derivative information (i.e., the Hessian of the Lagrangian),
whereas the NBI subproblem solver requires only a secant
approximation of the Hessian, such as BFGS.

-  It can bypass tracking active sets. For problems with inequality
constraints or explicit bounds on variables, homotopy techniques
need to keep meticulous track of the changes in the active sets of the
inequality constraints or bounds in the course of the initial value
problem integration, which can present difficulties if the number
of inequalities or bounds is large. On the other hand, an interior
point NLP solver used as the NBI subproblem solver would handle
this situation quite efficiently, and would not have a problem with
frequent changes in the active set.

•  NBI improves on other traditional methods like goal programming in
the sense that it never requires any prior knowledge of ‘feasible goals’.
It improves on multilevel optimization techniques from the tradeoff
standpoint, since multilevel techniques can usually improve only
a few of the ‘most important’ objectives, leaving no compromise for
the rest.
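
The remark above about secant approximations can be illustrated with the standard BFGS update, which builds a Hessian approximation from first-derivative differences only. The following is a minimal sketch, not the solver used in this work: the function name `bfgs_update` and the sample vectors are illustrative assumptions, `s` stands for the step between two iterates and `y` for the corresponding change in the gradient of the Lagrangian. The defining property is the secant equation, i.e. the updated matrix maps `s` to `y`.

```python
def bfgs_update(B, s, y):
    """Standard BFGS update of a Hessian approximation B (list of rows).

    s: step between iterates, y: change in the gradient over that step.
    Requires the curvature condition y^T s > 0.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sBs = sum(s[i] * Bs[i] for i in range(n))
    ys = sum(y[i] * s[i] for i in range(n))
    if ys <= 0.0:
        raise ValueError("curvature condition y^T s > 0 is violated")
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]

# Illustrative data: start from the identity and apply one update.
B0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y = [2.0, 1.0]
B_new = bfgs_update(B0, s, y)

# Secant equation: B_new applied to s reproduces y.
B_new_s = [sum(B_new[i][j] * s[j] for j in range(2)) for i in range(2)]
print(B_new, B_new_s)
```

No second derivatives are evaluated anywhere in the update, which is the point of contrast with the homotopy approach.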


8  A  note  on  local  versus  global 

It is worth observing here that unless the individual minima of the objectives
obtained at the outset are guaranteed to be global minima, there is no
guarantee that NBI produces solutions that are globally Pareto optimal. In
fact, as pointed out earlier, there is no guarantee that every solution produced
by NBI is even locally Pareto optimal. All we can conjecture is that
if the individual minima of the functions happen to be global minima and
if  we  start  NBI  from  every  point  on  CHIM  -{■  UCHIM,  the  set  of  points 
thus  obtained  would  contain  all  the  globally  Pareto  optimal  points,  provided 
the boundary of $\mathcal{F}$ is not ‘folded’. However, even when ‘folded’, the point
obtained could be locally Pareto optimal (see Fig. 2).

Figure 3: The normal from N intersects the boundary at E, but values of
the objectives at P are each less than the corresponding values at E, hence
E is not Pareto optimal.

Not being able to find globally Pareto optimal points is a drawback inherent
in every method that finds a large number of efficient points of MOP.
In homotopy methods, it would involve finding the global minimum of one
of the two objectives in the very beginning. In methods which find efficient
points by minimizing a single objective, only a global minimum of the
scalarized objective would correspond to a globally efficient point. Even though
a local minimum would still correspond to a locally efficient point, there is
no guarantee that minimizing a single objective produces a local minimum,
since most single objective optimization algorithms only converge to a KKT
point of the problem, i.e. one which only satisfies necessary conditions for
being a minimum and could thus well be a saddle-point (and not even a
local minimum!).
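
The last point can be seen on a very small unconstrained illustration, where the KKT conditions reduce to a vanishing gradient. The function $f(x, y) = x^2 - y^2$ below is a hypothetical example chosen for this sketch and is not taken from the problems in this paper; the origin satisfies the first-order condition, yet the function decreases along the $y$ direction.

```python
def f(x, y):
    """Illustrative function with a saddle point at the origin."""
    return x * x - y * y

def grad_f(x, y):
    """Gradient of f."""
    return (2.0 * x, -2.0 * y)

gx, gy = grad_f(0.0, 0.0)
stationary = abs(gx) == 0.0 and abs(gy) == 0.0    # first-order condition holds

step = 0.1
decreases_along_y = f(0.0, step) < f(0.0, 0.0)    # so the origin is not a minimum
increases_along_x = f(step, 0.0) > f(0.0, 0.0)    # and not a maximum either

print(stationary, decreases_along_y, increases_along_x)
```

A solver that stops as soon as the first-order conditions are met would accept such a point.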

Given the shortcomings of global optimization applied to nonconvex problems,
we choose to remain satisfied with the Pareto optimal points obtained
by NBI, in spite of the fact that they may not be globally efficient.


9  A  Numerical  Example 


We now give a brief account of applying NBI to the following small biobjective
problem:

$$\min_{x}\ \begin{bmatrix} f_1(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \\ f_2(x) = 3x_1 + 2x_2 - \dfrac{x_3}{3} + 0.01\,(x_4 - x_5)^3 \end{bmatrix}$$

subject to

$$x_1 + 2x_2 - x_3 - 0.5x_4 + x_5 = 2,$$
$$4x_1 - 2x_2 + 0.8x_3 + 0.6x_4 + 0.5x_5^2 = 0,$$
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \le 10.$$

NBI using the actual normal to the CHIM simplex (a line segment in
this case) was run three times on this problem for 21 different weight
vectors $w$: first on the original problem, then on the problem with $f_1$ scaled by
a factor of 5 (to increase the disparity between the scales of the objective
functions), and then with $f_1$ scaled by a factor of 10. The results in the
following table show that NBI successfully produces a uniformly distributed
set of Pareto optimal points even if the objective functions are disparately
scaled. (Note that the tabulated Pareto optimal function values have all
been converted back to their original scales.)


Weights       Objective values     Objective values     Objective values
(w1, w2)      (original scale)     (f1 scaled by 5)     (f1 scaled by 10)

0.00, 1.00    10.0000, -4.0111     10.0000, -4.0111     10.0000, -4.0111
0.05, 0.95     9.4717, -3.7902      9.5249, -3.8126      9.5270, -3.8135
0.10, 0.90     8.9453, -3.5665      9.0499, -3.6113      9.0541, -3.6131
0.15, 0.85     8.4208, -3.3398      8.5750, -3.4069      8.5812, -3.4095
0.20, 0.80     7.8985, -3.1097      8.1002, -3.1991      8.1083, -3.2027
0.25, 0.75     7.3785, -2.8759      7.6255, -2.9876      7.6354, -2.9921
0.30, 0.70     6.8612, -2.6381      7.1508, -2.7720      7.1626, -2.7773
0.35, 0.65     6.3469, -2.3958      6.6763, -2.5517      6.6897, -2.5580
0.40, 0.60     5.8359, -2.1483      6.2020, -2.3263      6.2170, -2.3335
0.45, 0.55     5.3286, -1.8951      5.7277, -2.0950      5.7442, -2.1032
0.50, 0.50     4.8256, -1.6353      5.2537, -1.8570      5.2715, -1.8661
0.55, 0.45     4.3275, -1.3679      4.7799, -1.6112      4.7989, -1.6213
0.60, 0.40     3.8353, -1.0916      4.3063, -1.3562      4.3263, -1.3672
0.65, 0.35     3.3499, -0.8046      3.8329, -1.0903      3.8538, -1.1022
0.70, 0.30     2.8730, -0.5047      3.3600, -0.8107      3.3813, -0.8237
0.75, 0.25     2.4067, -0.1885      2.8875, -0.5141      2.9090, -0.5281
0.80, 0.20     1.9542,  0.1490      2.4155, -0.1947      2.4368, -0.2097
0.85, 0.15     1.5209,  0.5159      1.9444,  0.1567      1.9649,  0.1406
0.90, 0.10     1.1164,  0.9272      1.4747,  0.5586      1.4932,  0.5413
0.95, 0.05     0.7635,  1.4178      1.0074,  1.0583      1.0222,  1.0398
1.00, 0.00     0.5551,  2.1306      0.5551,  2.1306      0.5551,  2.1306
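
As a consistency check on the table, the tabulated points for the original problem should lie on lines that start at evenly spaced points of the CHIM segment and run along its normal. The sketch below tests this for a few rows. It uses only the tabulated objective values; the names `chim_point`, `NORMAL` and `distance_to_normal_line` are illustrative, and the CHIM point for weights $(w_1, w_2)$ is taken as $w_1 F(x_1^*) + w_2 F(x_2^*)$, as the end rows of the table indicate.

```python
import math

# Objective vectors at the individual minimizers (end rows of the table).
F_X1 = (0.5551, 2.1306)    # F at the minimizer of f1, weights (1, 0)
F_X2 = (10.0000, -4.0111)  # F at the minimizer of f2, weights (0, 1)

def chim_point(w1):
    """Point w1*F(x1*) + (1 - w1)*F(x2*) on the CHIM segment."""
    w2 = 1.0 - w1
    return (w1 * F_X1[0] + w2 * F_X2[0], w1 * F_X1[1] + w2 * F_X2[1])

# Unit normal to the CHIM segment, oriented towards smaller objective values.
tx, ty = F_X2[0] - F_X1[0], F_X2[1] - F_X1[1]
length = math.hypot(tx, ty)
NORMAL = (ty / length, -tx / length)

def distance_to_normal_line(w1, point):
    """Distance from `point` to the line through chim_point(w1) along NORMAL."""
    cx, cy = chim_point(w1)
    dx, dy = point[0] - cx, point[1] - cy
    return abs(dx * NORMAL[1] - dy * NORMAL[0])

# A few (w1, tabulated point) pairs from the original-scale column.
ROWS = [
    (0.05, (9.4717, -3.7902)),
    (0.25, (7.3785, -2.8759)),
    (0.50, (4.8256, -1.6353)),
    (0.75, (2.4067, -0.1885)),
    (0.95, (0.7635, 1.4178)),
]
for w1, point in ROWS:
    d = distance_to_normal_line(w1, point)
    print(f"w1 = {w1:.2f}: distance to normal line = {d:.1e}")
```

The distances come out at the level of the four-digit rounding of the table, which is what one would expect if each tabulated point is the intersection of such a normal with the boundary.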

The plots of the Pareto optimal objective vectors tabulated above for
the original and scaled problems, shown in Fig. 4 and Fig. 5, reveal very
slight differences: with the first objective scaled, one point at the $F(x_1^*)$ end
moves a little further away. However, using the quasi-normal $\hat{n}$, even this
slight nonuniformity in the distribution of Pareto points is eliminated (see Fig.
6). The Pareto points obtained using the quasi-normal are independent of the
scale of $f_1$ and are tabulated below:


Weights       Objective values

0.00, 1.00    10.0000, -4.0111
0.05, 0.95     9.4254, -3.7706
0.10, 0.90     8.8546, -3.5276
0.15, 0.85     8.2882, -3.2818
0.20, 0.80     7.7264, -3.0329
0.25, 0.75     7.1698, -2.7807
0.30, 0.70     6.6189, -2.5247
0.35, 0.65     6.0743, -2.2647
0.40, 0.60     5.5368, -2.0000
0.45, 0.55     5.0072, -1.7302
0.50, 0.50     4.4866, -1.4546
0.55, 0.45     3.9764, -1.1722
0.60, 0.40     3.4781, -0.8820
0.65, 0.35     2.9939, -0.5827
0.70, 0.30     2.5266, -0.2724
0.75, 0.25     2.0801,  0.0514
0.80, 0.20     1.6597,  0.3922
0.85, 0.15     1.2740,  0.7556
0.90, 0.10     0.9370,  1.1506
0.95, 0.05     0.6754,  1.5947
1.00, 0.00     0.5551,  2.1306
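
Why the quasi-normal points do not depend on the scale of $f_1$ can be seen from the search directions alone. The sketch below assumes the quasi-normal is taken as $-\Phi e$ (minus the sum of the columns of $\Phi$, the usual choice in NBI), builds it for $f_1$ multiplied by several factors, and maps the direction back to the original scale; the same is done for the actual normal to the CHIM segment. The helper names are illustrative and the only data used are the two end rows of the table.

```python
# Objective vectors at the individual minimizers, from the end rows of the table.
F_X1 = (0.5551, 2.1306)    # at the minimizer of f1
F_X2 = (10.0000, -4.0111)  # at the minimizer of f2

def search_directions(scale):
    """Quasi-normal and actual normal for f1 multiplied by `scale`,
    both mapped back to the original objective scale."""
    a = (scale * F_X1[0], F_X1[1])
    b = (scale * F_X2[0], F_X2[1])
    shadow = (min(a[0], b[0]), min(a[1], b[1]))            # F* in the scaled space
    col1 = (a[0] - shadow[0], a[1] - shadow[1])            # first column of Phi
    col2 = (b[0] - shadow[0], b[1] - shadow[1])            # second column of Phi
    quasi = (-(col1[0] + col2[0]), -(col1[1] + col2[1]))   # -Phi e (assumed definition)
    normal = (b[1] - a[1], -(b[0] - a[0]))                 # perpendicular to the CHIM segment
    # Undo the scaling of the first objective.
    return (quasi[0] / scale, quasi[1]), (normal[0] / scale, normal[1])

def slope(v):
    """Slope df2/df1 of a direction in the (f1, f2) plane."""
    return v[1] / v[0]

for s in (1.0, 5.0, 10.0):
    quasi, normal = search_directions(s)
    print(f"scale {s:4.1f}: quasi-normal slope {slope(quasi):.4f}, "
          f"normal slope {slope(normal):.4f}")
```

In this sketch the quasi-normal slope is the same for every scale factor, so the NBI lines, and hence their intersections with the boundary, are unchanged; the slope of the actual normal changes with the scale, which is consistent with the small shifts seen in the first table.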

The method of linear combinations was run thrice on the same problem,
with the weight vectors $\alpha$ assuming the same 21 uniformly spread values as
the $w$ vector above^10.

When run on the original problem, the minimizer of $f_2(x)$ was found six
times for six different $\alpha$, and there was a considerable gap ‘in the middle’ of
the Pareto set [see Fig. 7].

With $f_1$ scaled by 5, the point found six times earlier was found only twice^11,
but the Pareto optimal vectors obtained were concentrated at the $F(x_1^*)$ end
and no ‘middle ground for compromise’ was captured [see Fig. 8].

With $f_1$ scaled by 10, the point repeated earlier was found only once, though
the clustering at the $F(x_1^*)$ end increased [see Fig. 9].

^10 The efficient solution scheme, i.e., starting the solution of a subproblem from the
optimal point of a ‘nearby subproblem’, was used here too.

^11 Heavily weighting the first objective made the minimizer move away from ...

The Pareto optimal vectors obtained using linear combinations are
tabulated below:

Weights            Objective values     Objective values     Objective values
(alpha1, alpha2)   (original scale)     (f1 scaled by 5)     (f1 scaled by 10)

0.00, 1.00         10.0000, -4.0111     10.0000, -4.0111     10.0000, -4.0111
0.05, 0.95         10.0000, -4.0111     10.0000, -4.0111      4.8211, -1.6330
0.10, 0.90         10.0000, -4.0111      4.1857, -1.2896      1.1634,  0.8741
0.15, 0.85         10.0000, -4.0111      1.6131,  0.4330      0.7689,  1.4083
0.20, 0.80         10.0000, -4.0111      1.0180,  1.0451      0.6559,  1.6416
0.25, 0.75         10.0000, -4.0111      0.7975,  1.3592      0.6100,  1.7724
0.30, 0.70          8.9403, -3.5644      0.6953,  1.5506      0.5876,  1.8563
0.35, 0.65          4.5379, -1.4822      0.6412,  1.6796      0.5754,  1.9146
0.40, 0.60          2.7307, -0.4109      0.6100,  1.7725      0.5682,  1.9576
0.45, 0.55          1.8319,  0.2473      0.5909,  1.8425      0.5637,  1.9905
0.50, 0.50          1.3357,  0.6928      0.5788,  1.8973      0.5608,  2.0165
0.55, 0.45          1.0425,  1.0147      0.5707,  1.9413      0.5589,  2.0376
0.60, 0.40          0.8615,  1.2583      0.5654,  1.9773      0.5576,  2.0551
0.65, 0.35          0.7463,  1.4492      0.5618,  2.0075      0.5567,  2.0698
0.70, 0.30          0.6719,  1.6029      0.5593,  2.0331      0.5561,  2.0823
0.75, 0.25          0.6236,  1.7295      0.5576,  2.0551      0.5557,  2.0931
0.80, 0.20          0.5926,  1.8356      0.5565,  2.0741      0.5554,  2.1025
0.85, 0.15          0.5734,  1.9258      0.5558,  2.0909      0.5553,  2.1108
0.90, 0.10          0.5622,  2.0035      0.5554,  2.1057      0.5552,  2.1181
0.95, 0.05          0.5567,  2.0711      0.5551,  2.1188      0.5551,  2.1247
1.00, 0.00          0.5551,  2.1306      0.5551,  2.1306      0.5551,  2.1306
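
The difference in spread between the two methods can be quantified directly from the tabulated values. The sketch below computes the Euclidean distance between consecutive objective vectors for the quasi-normal NBI points and for the linear-combination points on the original problem. The variable names are illustrative; the data are the tabulated values only.

```python
import math

# (f1, f2) from the quasi-normal NBI table, weights (0, 1) down to (1, 0).
nbi_points = [
    (10.0000, -4.0111), (9.4254, -3.7706), (8.8546, -3.5276),
    (8.2882, -3.2818), (7.7264, -3.0329), (7.1698, -2.7807),
    (6.6189, -2.5247), (6.0743, -2.2647), (5.5368, -2.0000),
    (5.0072, -1.7302), (4.4866, -1.4546), (3.9764, -1.1722),
    (3.4781, -0.8820), (2.9939, -0.5827), (2.5266, -0.2724),
    (2.0801, 0.0514), (1.6597, 0.3922), (1.2740, 0.7556),
    (0.9370, 1.1506), (0.6754, 1.5947), (0.5551, 2.1306),
]

# (f1, f2) from the linear-combinations table, original-scale column.
lc_points = [
    (10.0000, -4.0111), (10.0000, -4.0111), (10.0000, -4.0111),
    (10.0000, -4.0111), (10.0000, -4.0111), (10.0000, -4.0111),
    (8.9403, -3.5644), (4.5379, -1.4822), (2.7307, -0.4109),
    (1.8319, 0.2473), (1.3357, 0.6928), (1.0425, 1.0147),
    (0.8615, 1.2583), (0.7463, 1.4492), (0.6719, 1.6029),
    (0.6236, 1.7295), (0.5926, 1.8356), (0.5734, 1.9258),
    (0.5622, 2.0035), (0.5567, 2.0711), (0.5551, 2.1306),
]

def gaps(points):
    """Euclidean distances between consecutive objective vectors."""
    return [math.hypot(q[0] - p[0], q[1] - p[1])
            for p, q in zip(points, points[1:])]

nbi_gaps = gaps(nbi_points)
lc_gaps = gaps(lc_points)

print(f"NBI: min gap {min(nbi_gaps):.3f}, max gap {max(nbi_gaps):.3f}")
print(f"LC : min gap {min(lc_gaps):.3f}, max gap {max(lc_gaps):.3f}")
```

For NBI all gaps are of comparable size, whereas the linear-combination points include repeated points (zero gaps) and one jump across the middle of the Pareto set that is several times larger than any NBI gap.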

Clearly, the inability of the method of linear combinations to capture the
‘middle ground’ of the Pareto set adequately renders it fairly useless as a
means of studying the tradeoff between the conflicting objectives.

9.1  Function scaling implicit in NBI

Even  though  the  NBI  using  the  quasi-normal  component  is  unaffected  by  the 
function  scales,  this  property  comes  with  a  price.  As  the  functions  get  more 
disparately  scaled,  the  Pareto  set  gets  more  ‘stretched’,  and  consequently  the 
NBI  points  get  further  apart  from  each  other.  Consequently,  solving  an  NBI 
subproblem  starting  from  the  solution  of  the  same  nearby  subproblem  takes 
more  iterations  to  converge.  This  was  observed  in  the  numerical  example 
above  and  motivates  the  need  to  scale  the  functions  properly  to  remove  this 
disparity  in  scales. 

Geometrically, it can be perceived that if the vertices of the CHIM
simplex are almost equidistant from the origin, i.e. the quantities

$$\|F(x_i^*) - F^*\|, \quad i = 1, \ldots, n$$

are almost equal, then the quasi-normal direction $\hat{n}$ is almost normal to the
CHIM simplex. This would achieve the ‘minimally stretched’ Pareto set we
want and could also be a good scaling for the problem in the sense that all
the functions would be about the same order of magnitude, thus reducing
possible ill-conditioning.


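For the biobjective problem, $\Phi$ is antidiagonal; thus a scaling that would
achieve the above is obvious:

$$f_1 \leftarrow \frac{f_1}{f_1(x_2^*)}, \qquad f_2 \leftarrow \frac{f_2}{f_2(x_1^*)},$$

where the objectives are measured relative to $F^*$, so that $f_1(x_2^*)$ and $f_2(x_1^*)$
are the off-diagonal entries of $\Phi$. This
gets each vertex of CHIM to be unit distance from the origin.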
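
The biobjective scaling can be checked on the numerical example of Section 9. The sketch below is a minimal illustration under two assumptions: objectives are measured from $F^*$ before being divided by the off-diagonal entries of $\Phi$, and the quasi-normal is taken as minus the sum of the CHIM vertices. The names `SHADOW`, `phi_12`, `phi_21` and `scaled_vertex` are illustrative, and the data are the two end rows of the tables.

```python
import math

F_X1 = (0.5551, 2.1306)      # F at the minimizer of f1
F_X2 = (10.0000, -4.0111)    # F at the minimizer of f2
SHADOW = (F_X1[0], F_X2[1])  # F* = (f1*, f2*)

phi_12 = F_X2[0] - SHADOW[0]  # f1 at the minimizer of f2, measured from f1*
phi_21 = F_X1[1] - SHADOW[1]  # f2 at the minimizer of f1, measured from f2*

def scaled_vertex(F):
    """Shift an objective vector by F* and divide by the scale factors."""
    return ((F[0] - SHADOW[0]) / phi_12, (F[1] - SHADOW[1]) / phi_21)

v1 = scaled_vertex(F_X1)
v2 = scaled_vertex(F_X2)

# After scaling, the quasi-normal is perpendicular to the CHIM segment.
quasi_normal = (-(v1[0] + v2[0]), -(v1[1] + v2[1]))
chim_direction = (v2[0] - v1[0], v2[1] - v1[1])
dot = quasi_normal[0] * chim_direction[0] + quasi_normal[1] * chim_direction[1]

print("vertex distances:", math.hypot(*v1), math.hypot(*v2))
print("quasi-normal . CHIM direction =", dot)
```

Both vertices end up at unit distance from the origin and the quasi-normal coincides with the actual normal direction, which is the geometric situation described above.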


However, the solution may not be so transparent for more than two objectives,
and it may not be possible to get all the vertices exactly equidistant
from the origin. So now we shall attempt to find function scalings $d_i > 0$
such that the functions scaled as

$$f_i \leftarrow \sqrt{d_i}\, f_i$$

will have the property that the variance among the (squared) scaled distances of the
vertices from the origin, i.e.

$$\|\sqrt{D}\,(F(x_i^*) - F^*)\|^2, \quad i = 1, \ldots, n,$$

will be minimized ($D = \mathrm{diag}(d)$, where $d$ represents the vector with components
$d_i$).


Let $v_j = \|\sqrt{D}\,(F(x_j^*) - F^*)\|^2$, i.e.,

$$v_j = \sum_{i=1}^{n} d_i\,\phi_{ij}^2,$$

where $\phi_{ij}$ is the $i$th row, $j$th column entry of the matrix $\Phi$.


The mean square distance of the vertices is defined as

$$\bar{v} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} d_i\,\phi_{ij}^2.$$

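The variance quantity to be minimized is given by

$$V(d) = \sum_{j=1}^{n} (v_j - \bar{v})^2,$$

i.e.,

$$V(d) = \sum_{j=1}^{n}\Big(\sum_{i=1}^{n} d_i\,\phi_{ij}^2 - \frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{n} d_i\,\phi_{ik}^2\Big)^2.$$

Let $A$ be the matrix with components $A_{ij}$ given by

$$A_{ij} = \phi_{ij}^2 - \frac{1}{n}\sum_{k=1}^{n}\phi_{ik}^2.$$

Then

$$V(d) = \sum_{j=1}^{n}\Big(\sum_{i=1}^{n} d_i\,A_{ij}\Big)^2,$$

$$V(d) = d^T A A^T d = \|A^T d\|^2.$$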
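
The identity between the variance of the squared vertex distances and the quadratic form $\|A^T d\|^2$ can be verified numerically. The sketch below is an illustration under stated assumptions: the $3 \times 3$ matrix `phi3` and the scale vector `d3` are made-up test data, the function names are illustrative, and the biobjective check uses the $\Phi$ of the numerical example with scales chosen so that both squared distances equal one (a normalization picked only for this sketch).

```python
def variance_matrix(phi):
    """A_ij = phi_ij^2 - (1/n) * sum_k phi_ik^2."""
    n = len(phi)
    return [[phi[i][j] ** 2 - sum(phi[i][k] ** 2 for k in range(n)) / n
             for j in range(n)] for i in range(n)]

def variance_direct(phi, d):
    """Sum over vertices of (v_j - mean)^2 with v_j = sum_i d_i phi_ij^2."""
    n = len(phi)
    v = [sum(d[i] * phi[i][j] ** 2 for i in range(n)) for j in range(n)]
    v_bar = sum(v) / n
    return sum((vj - v_bar) ** 2 for vj in v)

def variance_quadratic(phi, d):
    """The same quantity evaluated as ||A^T d||^2."""
    n = len(phi)
    A = variance_matrix(phi)
    At_d = [sum(A[i][j] * d[i] for i in range(n)) for j in range(n)]
    return sum(c * c for c in At_d)

# Hypothetical three-objective data with a zero diagonal, as Phi has.
phi3 = [[0.0, 2.0, 3.0], [1.0, 0.0, 4.0], [5.0, 6.0, 0.0]]
d3 = [0.5, 1.0, 2.0]
print(variance_direct(phi3, d3), variance_quadratic(phi3, d3))

# Biobjective example: Phi from the tabulated individual minima.
phi2 = [[0.0, 9.4449], [6.1417, 0.0]]
d2 = [1.0 / 9.4449 ** 2, 1.0 / 6.1417 ** 2]
print(variance_direct(phi2, d2))
```

For the biobjective data the variance vanishes, and $\sqrt{d_i}$ reproduces the reciprocal scale factors of the biobjective scaling discussed earlier in this section.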

This quadratic function is convex in $d$, and has an unconstrained minimizer
at $d = 0$. Thus we shall demand a specific value of $\bar{v}$, which represents
an average distance of the CHIM simplex from the origin^12 and is roughly
the same order of magnitude as a typical function value of any objective
encountered in the computation. Say we want a typical objective value to
be $r$, which could be something like 10. Then we would enforce an equality
constraint that fixes the mean square distance

$$\bar{v} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} d_i\,\phi_{ij}^2$$

at a value determined by $r$, along with a small lower bound on $d_i$. Thus the
optimization problem to be solved to obtain our ‘optimal’ scales is

$$\min_{d}\; V(d) = d^T A A^T d$$

subject to this equality constraint on $\bar{v}$ and to the lower bounds
$d_i \ge \epsilon$, $i = 1, \ldots, n$, where $\epsilon$ is a small positive constant of the form $10^{-k}$.

^12 Using the mean distance instead of the mean square distance for this constraint would
result in loss of convexity.

Thus we can see how the matrix $\Phi$ suggests an ‘improved scaling’ of the
objective functions, which is a bonus in the NBI approach.

10  Conclusion 

An algorithm was presented for finding Pareto optimal points of any smooth,
constrained multiobjective problem with essentially any number of objectives.
One question that is left open is how the user would select the final
design point from the Pareto set generated by NBI (or any other algorithm
which generates the Pareto set). For two or three objectives, the generated
Pareto curve/surface can be visualized with standard 2-D or 3-D plots,
which may be all the user needs to arrive at a final design point. However,
the visualization process may be complicated for more than three objectives,
and how helpful it will be in guiding the user towards a better choice may
depend on factors like the psychological aspects of the visualization. One
procedure that could perhaps be useful is to have the user specify another
‘cost’ or ‘utility’ function, whose value could be reported at each of the
Pareto optimal points generated by NBI, and the user could make his/her
final choice based on this ‘cost’. Also, if there are more than three objectives
and if it is possible to set up a hierarchical order of preference in blocks
of two or three (e.g. $f_2, f_4, f_5$ are more important than $f_1, f_3$), the Pareto
points for the combined problem could be visualized for each of the blocks,
starting at the most important, and the user could narrow his/her
preferences down the blocks.

Further research is in progress regarding the above issue and also
regarding the development of efficient nonlinear programming techniques for
solving the NBI subproblems and parallelizing the entire algorithm.

Figure 4: Pareto optimal vectors in the objective space using NBI with
actual normal on the original problem. (Plot title: "Pareto points obtained using NBIgeneralS"; axis label: F(1).)

Figure 6: Pareto optimal vectors in the objective space using NBI with
quasi-normal on the problem with $f_1$ scaled by 10. (Plot title: "Pareto points obtained using NBIgeneralS"; axis label: F(1).)

Figure 8: Pareto optimal vectors in the objective space using the method of
linear combinations on the problem with $f_1$ scaled by 5. (Plot title: "Efficient points obtained by minimizing convex combinations of objectives".)